The AliTongyi Lab recently open-sourced FunAudioLLM, a large-scale audio generation model project that aims to make voice interaction between humans and Large Language Models (LLMs) more natural. The project consists of two core models: SenseVoice and CosyVoice.
CosyVoice focuses on natural voice generation, with multi-language support and control over voice and emotion. It excels in multi-language voice generation, zero-shot voice generation, cross-language voice synt